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We systematically analyze the primordial non-Gaussianity estimator used by the Wilkinson Mi¬ 
crowave Anisotropy Probe (WMAP) science team with the basic ideas of estimation theory in order 
to see if the limited Cosmic Microwave Background (CMB) data is being optimally utilized. The 
WMAP estimator is based on the implicit assumption that the CMB bispectrum, the harmonic 
transform of the three-point correlation function, contains all of the primordial non-Gaussianity 
information in a CMB map. We first demonstrate that the Signal-to-Noise ( S/N ) of an estimator 
based on CMB three-point correlation functions is significantly larger than the S/N of any estimator 
based on higher-order correlation functions; justifying our choice to focus on the three-point corre¬ 
lation function. We then conclude that the estimator based on the three-point correlation function, 
which was used by WMAP, is optimal, meaning it saturates the Cramer-Rao Inequality when the 
underlying CMB map is nearly Gaussian. We quantify this restriction by demonstrating that the 
suboptimal character of our estimator is proportonal to the square of the fiducial non-Gaussianity, 
which is already constrained to be extremely small, so we can consider the WMAP estimator to be 
optimal in practice. Our conclusions do not depend on the form of the primordial bispectrum, only 
on the observationally established weak levels of primordial non-Gaussianity. 

PACS numbers: 98.70.Vc, 98.80.-k 


I. INTRODUCTION 

The origin of the cosmological perturbations which led to the observed Cosmic Microwave Background (CMB) 
temperature anisotropies and large scale structure is one of the outstanding questions in cosmology. Unfortunately 
there are only a limited number of independent ways we can constrain the mechanism that produced these pertur¬ 
bations. Windows into the production mechanism of cosmological perturbations include a spectrum of primordial 
gravity waves, departures from a scale-invariant curvature power spectrum, isocurvature primordial perurbations and 
non-Gaussianity statistics of the primordial curvature perturbations. While each of these features of the primordial 
perturbations will change the observed CMB anisotropies and provide insight into the production mechanism of the 
primordial perturbations, we will focus on the non-Gaussian characteristics of the CMB anisotropies in this paper. 

The standard theory of inflation robustly predicts that the primordial curvature perturbations, and therefore the 
resulting CMB anisotropies, should be nearly Gaussian Q. By Gaussian we mean that all odd n-point correlation 
functions exactly vanish and all even n-point correlation functions can be completely expressed as the combination 
of two-point correlation functions. A non-Gaussian primordial curvature perturbation field would violate one of the 
above criteria. The degree to which any given Gaussianity criterion is violated can be characteristized by forming a 
natural ratio between the particular non-Gaussian correlation function and the appropriate combination of two-point 
corellation functions [2]. Since we will ultimately perform calculations with a non-Gaussian Probability Distribution 
Function (PDF) (from which all n-point correlation functions can be calculated), below we will provide a natural 
quantity that describes the degree of non-Gaussianity in terms of quantites of the PDF. Then we can define the 
condition nearly Gaussian to mean that this natural quantity is much less than one. Even though the standard 
calculation within inflation predicts extremely small amounts of non-Gaussianity, there are several viable mechanisms 
that can produce substantial non-Gaussian primordial curvature perturbations (see [§j and references there within). 
These processes not only predict different amplitudes of the total non-Gaussianity but also different functional forms 
0. Through linear gravitational and hydrodynamical evolution these curvature perturbations will produce CMB 
anisotropies with statistical properties that mirror the statistical properties of the primordial perturbations. Therefore 
it is possible to learn about primordial non-Gaussianity by studying higher order n-point correlation functions of the 
CMB anisotropies. 

There are more potential sources of non-Gaussianity in the CMB than just primordial non-Gaussianity. The 
non-linearities in the gravitational and hydrodynamical equations of motion for the baryon-photon fluid prior to 
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recombination can produce non-Gaussianity a i 0 Secondary anisotropies, such as, the thermal and kinetic 
Sunyaev-Zeldovich effects, gravitational lensing and the Ostriker-Vishniac effect can all produce non-Gaussianity in 
the CMB HUES- However we are most interested in an observation of primordial non-Gaussianity in the CMB 
because this would require new ideas for the production of the primordial curvature perturbations and may give a 
glimpse into beyond the Standard Model physics. It has been demonstrated that the primordial non-Gaussian signal 
can be separated from non-Gaussian secondary anisotropies on scales relevant for WMAP and Planck EH- 

As stated above we expect the non-Gaussianity of the CMB to be extremely small. Instead of trying to detect 
individual modes of a non-Gaussian correlation function |.87l| . which would allow us to the examine the functional 
dependence of momentum wavevectors and better understand the mechanisms that produced the non-Gaussianity 
0, it is customary to parameterize the primordial non-Gaussianity with a model. Thus we will be able to combine 
many bispectra modes and therefore increase the statistical significance of our conclusions. The standard model for 
primordial non-Gaussianity is the “Local Model”; the primordial curvature perturbations in real-space are expressed 
as 


*(*) = *(*) + fNLlHx) 2 - (<h(a:) 2 )], (1) 

where <E>(cc) is a Gaussian field with zero mean and covariance matrix C. The “Local Model,” even though an 
idealization, has a strong physical motivation; during inflation the non-linear couplings of general relativity will 
produce “Local Model” terms in the bispectrum, the harmonic transform of the connected three-point function Q. 
Standard measures of the similarity of two bispectra imply that the “Local Model” and the standard inflationary 
calculation are nearly identical [5j. Moreover, models where non-linearities develop outside the horizon, such as the 
curvaton model 0 and the inhomogeneous reheating models Emi, will produce bispectra identical to the “Local 
Model.” This model will allow us to place limits on the cumulative amplitude of all bispectrum modes and a method 
has been developed to translate these constraints into constraints on other models of primordial non-Gaussianity p|. 

The characteristic amplitude of non-Gaussianity can be found by rescaling T to have unit variance. The coefficient 
of the rescaled quadratic term, /jvz,^^) 2 ) 1 ' 2 , is the natural measure of the amplitude of non-Gaussianity. Below 
we will find that the Signal-to-Noise of estimators constructed out of non-Gaussian three-point correlation functions 
will always contain this factor (or related quantities for higher order non-Gaussian n-point correlation functions). 
The best current data constrains fNL(^(x) 2 ) 1 ^ 2 < 3.5 x 10 -3 (95% C.L.) 0- The smallness of this expansion 
parameter not only allows us to truncate the expansion of the definition of the non-Gaussian field, Eq. ©, but it also 
implies that the characteristic amplitude of the bispectrum will be much larger than the characteristic amplitude of 
the trispectrum, the harmonic transform of the connected four-point function. By rescaling arguments analogous to 
above, the characteristic amplitude of the trispectrum, either due to an additional cubic term (f^(x) 3 ) or from the 
quadratic piece (/atl'I > (®) 2 ), would be proportional to {f% L + fa)($(x) 2 ). So unless fa is much larger than /nl, the 
characteristic amplitude of the bispectrum will be much larger than the characteristic amplitude of the trispectrum. 

We are not simply interested in the characteristic amplitude of a higher-order correlation function, but in the 
cummulative Signal-to-Noise ( S/N ) of an estimator based on that higher-order correlation function. The number of 
relevant graphs grows quickly as the order of the correlation function increases, for example the number of trispectrum 
quadrilaterals is much greater than the number of bispectrum triangles. As demonstrated in Appendix A, if the 
primordial power spectrum of the Gaussian field is scale-invariant, then the (S/N) 2 of a non-Gaussianity estimator 
based on any non-Gaussian correlation function, regardless of order, will simply scale with the number of observed 
pixels ( Npi X ), not the number of graphs. Ignoring combinatorial factors we find, for an estimator constructed out of 
the n - point correlation function, that 

(S/N) n ~ (fn — l + fn-2 + ■ ■ ■ + fr 2 )(Hx) 2 ) (n ~ 2)/2 VN^. ( 2 ) 

Since ($(a;) 2 ) 1 / 2 ~ 10 -5 the estimator based on the three-point correlation function (/nl = fa) will dominate over all 
higher-order estimators. This implies that the three-point correlation function is the most effective means to constrain 
Jnl and that it will be significantly easier to constrain /nl than any higher-order f n . Thus we should begin to explore 
primordial non-Gaussianity by constraining the amplitude of the bispectrum. 

These conclusions depend upon the naturalness argument that fn, which implies the S/N of the various 

estimators will be supressed by increasing factors of ($(a:) 2 ) 1 / 2 , which we believe to be approximately 10 -5 . There 
are several reasons to be cautious. First of all, there might be some reason due to symmetry that f^L = 0, then 
the primordial bispectrum will vanish. Also in the inhomogeneous reheating models, inefficiency in the production of 
the gravitational potential could equally supresses all terms in expansion of the non-Gaussian field; therefore Eq. © 
would become 


^(*) = A[$(®) + f NL [$(x ) 2 - ($(*) 2 )]], 


( 3 ) 
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where A < 1 0- This mutual suppression would increase the inferred value of ($( x ) 2 ) 1 / 2 and cause the suppression 
of the S/N of higher-order estimators to be less drastic. However within the standard slow-role single field inflationary 
scenario jl] our conclusions are quite robust. 

The standard inflationary scenario for the generation of primordial curvature perturbations predicts that these 
perturbations were quantum mechanically produced. While it is impossible for us to predict the primordial curvature 
fluctuation at any point in the universe, it is possible to predict the statistical properties of these perturbations. 
With these statistical properties we can create a PDF from which we can generate random realizations of the non- 
Gaussian curvature perturbations. Since the CMB anisotropies were generated through the linear gravitational 
and hydrodynamical evolution of these primordial curvature perturbations, the statistical properties of the CMB 
anisotropies will mirror the statistical properties of the primordial curvature perturbations. Therefore we can view 
the data from a given CMB experiment as random samples drawn from the appropriate CMB PDF derived from the 
Local Model PDF. 

We have an explicit expression for the primordial curvature perturbations, Eq. 0, so we can write down the exact 
PDF for this random field 

P(Wnl,C) = J d N ^ N \^(x) - $(®) - f NL [<f>(x) 2 - ($(®) 2 )])P(d>|C), (4) 


and we also include the assumption that <f> has a Gaussian distribution 


P(<F|C) 


e -4> T C _1 $/2 

\J(2tt) n detC 


(5) 


Here we have adopted standard notation for functionals and compactly written the quadratic form of functionals 
in vector notation. Depending on the particular situation, the quantity could either be a sum over a 

discrete set of eigenfunctions (for example, observations of CMB anisotropies <E> T C -1 <h = ^2im a *Lm a im/Ci ) or an 
integral over a continuous set of eigenfunctions (for example, observations of primordial curvature perturbations 
<1> T C -1 <f> = J d 3 fe < f , *(fc) < F(fc)/P(fc)). Assuming that the Local Model is correct and the PDF in Eq. 0 properly 
represents nature, we can ask how well the data sample will allow us to constrain the underlying parameters of 
the PDF (Jnl,C). There are standard tools from the field of statistical estimation theory that will help us rate 
estimation procedures and determine the smallest possible error bars we can place on a parameter with a given set of 
data [Hill EES El. 

The purpose of the paper is to determine what are the smallest error bar possible to place on / ,y l with a given 
data set and which estimation procedure will allow us to place these constraints. The entire set of measured n- 
point correlation functions contains all observable information on the underlying PDF contained in the given data 
set. It may be possible to find a simple estimator which retains all of the available information without undergoing 
the time consuming process of measuring all n-point correlation functions. Currently researchers are using several 
different techniques to estimate the non-Gaussianity of a CMB map. An estimator based on the CMB three-point 
correlation function has been used to constrain the non-Gaussianity (/ nl ) of the results from the Wilkinson Microwave 
Anisotropy Probe (WMAP) [3, 42, Very Small Array (VSA) [23j and MAXIMA |24} CMB experiments. In addition, 
Minkowski Functionals ,25] and the correlations of Fourier phases (see Efj and references within) have been used 
to characterize non-Gaussianity. We will demonstrate, that for weak levels of non-Gaussianity, the estimator based on 
the three-point function contains all possible information on f^L ■ In particular we will show that the exact estimator 
used by WMAP is an optimal estimator. Therefore the procedure adopted by WMAP will, in principle, provide 
the best possible error bars on /nl- By using this estimator none of the information potentially contained in the 
higher order n-point functions is lost. This conclusion is equivalent to the statement that an estimator based on the 
two-point correlation function contains all possible information on the power spectrum, even though higher order 
even n-point correlations functions also contain information on the power spectrum. Finally we will argue that our 
conclusion does not depend on any characteristic of the Local Model, just on the observationally established weak 
levels of non-Gaussianity, so the amplitude of any primordial bispectrum can be optimally constrained by estimators 
based on the CMB three-point correlation function. 

The calculations in this paper assume that the CMB map is free of both instrument noise and Galactic foreground 
contamination. It is straightforward to include the effects of noise in our estimator for /nl- However the statistical 
estimation of individual bispectrum modes is very complicated for maps with inhomogeneous noise or a Galactic sky 
cut. The simple calculation of individual bispectrum modes by combining the appropriate coefficient of the spherical 
harmonic decomposition of the observed CMB map will result in sub-optimal estimates for the bispectrum modes and 
therefore sub-optimal estimates for /nl- An optimal estimator for individual bispectrum modes, including the effects 
of inhomogeneous instrument noise, has been developed |28j . In this paper we treat the part of the problem which 
involves the estimation of /Ax from measured bispectrum modes. 
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In Section II we introduce the notion of an optimal estimator and the Cramer-Rao Inequality and show how they are 
related to the Maximum Likelihood Estimator. In Section III we analyze a Poisson random field with the appropriate 
Local Model non-Gaussian PDF as a simple example of the essential ideas. In Section IV we extend our analysis to 
the scale-invariant distribution predicted by inflation and generalize our results to arbitrary primordial bispectrum in 
Section V. In Section VI we summarize our results. 


II. ESTIMATION THEORY 

Estimation theory is the branch of statistics developed in order to analyze the procedures used to constrain the 
underlying continuous parameters of a PDF. Assuming that a given data set is drawn from a known PDF with 
unknown fixed parameters it allows us to determine the minimum error bars attainable on those parameters using 
the data. In the following subsections we develop a simple sufficient and necessary condition for a PDF to admit an 
estimator that saturates the famous Cramer-Rao Inequality. 

Then for completeness we will discuss the relationship between these concepts and the popular Maximum Likelihood 
Estimator (MLE). We will only outline necessary concepts, the interested reader should consult M MM IHIH 
for more details. 


A. Cramer-Rao Inequality 

In this subsection we discuss the Cramer-Rao Inequality, which determines the lower bound for error bars that 
we can place on a parameter with a given data sample. These minimum errors bars derived from the Cramer-Rao 
Inequality are valid only for the “frequentist,” and not the Bayesian, understanding of statistical estimation (see 
f° r a discussion of these differing understandings of probability and estimation). An estimator that 
saturates this inequality is dubbed optimal since it weights the data in the best possible manner. Unfortunately for 
a general PDF no optimal estimator will exist. In order to develop some intuition regarding the information content 
of a data sample we will derive the Cramer-Rao Inequality for a scalar parameter. The generalization to multiple 
parameters is conceptly straightforward, but requires linear algebra which obscures some of the essential insights. 

We let A(a:) be an unbiased estimator of A, the parameter we are trying to estimate from our data sample. Assuming 
the regularity condition 

/ = 0 , ( 6 ) 

which holds when we can interchange the order of integration and differentiation, we can use this property to derive 
the following identity 

J(A(x) - X) dlnp ^ X \ {x\X)dx = 1. (7) 

Using the Schwarz Inequality, (A ■ B ) 2 < (A ■ A)(B ■ B ) in the notation of linear algebra, we find 

[ J(A(x) - X) dlnp ^ X \ {x\\)dx] 2 < [ J (A(®) - X) 2 p(x\X)dx] [J( dlnp ^\ ) 2 p ( x \X)dx\. (8) 

Now defining the Fisher Information as 

m = j ( dlnP g* lX) )Mx\ X)dx, (9) 

and using Eq. 0 we obtain the Cramer-Rao Inequality 

Var(A(a;)) = [J(A(x) - X) 2 p{x\X)dx\ > —(10) 

The Cramer-Rao Inequality states that no estimator of A can produce error bars smaller than \/F(X). 

More importantly we can identify a necessary and sufficient condition for the variance of A (a:) to saturate the 
Cramer-Rao Inequality. It is clear that the Schwarz Inequality in Eq. 0 will be saturated if and only if 

91np(ai|A) 


dX 


F(X)(A(x) — A) 


( 11 ) 
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thus the PDF must be able to be written in this form if it will admit an optimal estimator. In what follows, we will 
analyze the Local Model PDF to determine if it can be expressed in this form. 

There is a multiparameter generalization of the Fisher Information defined in Eq. 


n p{xWdx , 


dXt 


OXn 


( 12 ) 


If we define Cy as the covariance matrix of the set of parameters, the Cramer-Rao Inequality becomes the statement 
that C — T is a positive semidefmite matrix. This implies the more familiar statement 

Var(Xi) = Ca > . (13) 


If all off diagonal terms in Cy vanish then we can independently estimate all A^ and the multiparmeter Cramer-Rao 
Inequality reduces the single parameter case. 


B. Relationship with the Maximum Likelihood Estimator 


Previous work on the analysis of data focused on the Maximum Likelihood Estimator (MLE). Here we will discuss 
the relationship between the more familiar MLE and the optimal estimator that we introduced above. The basic 
principle of the MLE is to consider the observed data as fixed and consequentially choose A in order to maximize the 
probability of observing the fixed data. When we take this approach the PDF will be refered to as the likelihood 
function. We must choose A in order to maximize the likelihood function, so equivalently we demand 


<91np(a;|A) 

dX 


\X—X ml 


= 0 , 


(14) 


then Xml is the estimated value of A. Notice that this equality is true for all data realizations and not just for 
ensemble averages. In general Eq. Cl will be a complicated non-linear equation, but there are standard techniques 
to solve such equations. The most popular is the Newton-Raphson method which is widely used in cosmology for 
likelihood estimation of the CMB power spectrum from observed CMB maps HU 03 ■ 

This approach is widely used in practice because it is always possible to implement, whereas the approach described 
above often does not yield an estimator. In addition, the MLE is asymptotically optimal and unbiased, meaning that 
as the amount of data increases the MLE approaches the correct answer with error bars equal to those predicted by the 
Cramer-Rao Inequality. Whenever an optimal estimator exists, the optimal estimator is also the MLE. This is clear 
from the necessary and sufficient condition for the existence of an optimal estimator, Eq. im which automatically 
satisfies the definition of the MLE, Eq. ®. However the converse does not hold and in general, for finite amounts 
of data, the MLE is not optimal. 


III. POISSON DISTRIBUTIONS 

In order to gain intuition we will first analyze a Poisson random field with the appropriate non-Gaussian PDF. This 
implies that each point in space is uncorrelated with one another and will be independently sampled from the non- 
Gaussian Local Model PDF. Since we measure N independent and identically distributed (HD) random variables we 
can simply scale the single pixel Fisher Matrix by N. With these assumptions, the PDF for the single pixel primordial 
curvature perturbation, Eq. ®, becomes 

r e -$ 2 / 2 /P 

JW*L,A*) = J - f NL [<$> 2 - A* 2 ]) - , (15) 

where /.t 2 = ($ 2 ). Given an observed T there are two possible values of the underlying 4> 

$± = 777 —[±\/l + 4 / jvl (^ + InlH 2 ) - 1]- (16) 

AJNL 

Integrating over the delta-function the PDF becomes 

l e -$_/2/P 

PWfNInU) = ^^[1 + 2 /^$+ + 1 + 2 f NL $J' 


(17) 
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In the weakly non-Gaussian limit, JnlH 1, the contribution from *!>_ is exponentially suppressed so we will 
ignore it in what follows; we can then write 

logP(^|/jvz,,£t) = - ln(l + 2f NL <5> + ) - ln/i, (18) 

which can be expanded in a power series 


log P('&\f NL t A 4 ) = 


d' 2 + 2/i 2 ln/i 2 f NL 


/. 


+ ^ 3/i 2 <P) - ^%(5<k 4 + 5/i 4 - 14/i 2 * 2 ) + OUnl)- 


2/i 2 




2/i : 


(19) 


If we rewrite log P such that the power series is expressed in terms of 'k//t, which is a random variable with unit 
variance, then we discover that we are actually expanding in the quantity /tvl/c Doing so we can write Eq. m as 

log = --To(^/m) + fNLlih{^/n) - /fj) + c>(/| ri /i 3 ), (20) 

where Iq , I\ and I 2 are defined with respect to Eq. m - The expectation value of the first-order piece is 

(h)=6f NL ^ + 0(f NL ^), (21) 

and the second-order piece is 

(I 2 ) = 6 + 272 f NL ^ 2 + 0(f 4 NLf i 4 ). (22) 


Thus it is necessary to keep the second order term, I 2 , in the expansion of logP. However we can ignore the non- 
Gaussian piece of (I 2 ), which is a factor of /jv^/i 2 smaller than the Gaussian piece of ( I 2 ). Our conclusions will 
depend on this approximation, which is equivalent to the weakly non-Gaussian approximation typically made in the 
literature. 

If we view the PDF as a likelihood function, as described above, only I\ contains information on /nl to 0(fff L /j, 4 ). 
Since we only need to calculate I\ from the observed data to specify the likelihood we can regard this quantity as an 
estimator for Jnl\ we will address the need to specify I 2 below. Now we will analyze the statistical properties of this 
estimator. 

After choosing the normalization in order to unbias the estimator we find 

f NL = -^^ 3 -3^). (23) 

Once we include all N independent observations, this estimator becomes 

1 N 

fNL = e, TvS'**- 3 ** 2 **). < 24 > 

which has the variance 

Var(f NL ) = —+ 0(f% L n 4 ). (25) 

The PDF defined in Eq. lt*H)l) satisfies the regularity condition, Eq. ©. for both f^L and /t so we can use the 
Cramer-Rao Inequality to give the lower bound on the error bars of these parameters. This lower bound will be 
different than the variance of Jnl if the PDF does not satisfy the appropriate necessary and sufficient condition for 
a PDF to admit an optimal estimator, Eq. CD- Simple inspection of the single pixel PDF, Eq. C0 

dlUP Qf^ L NL) = 6 AfNL^) ^ W/x)] + 0(f NL /i 2 ), (26) 

shows that the necessary and sufficient condition of Eq. o is strictly met only if /jvl = 0. Thus our estimator 
/jvl(v k) is only optimal for setting non-Gaussian limits on Gaussian maps. 

The PDF does not satisfy the appropriate conditions for the existance of an optimal estimator because the function 
^(d'/zi) multiplies Jnl- When we evaluate ( I 2 ), we find that there is a leading order Gaussian piece and a non- 
Gaussian piece, suppressed by a factor of /^ L /i 2 . If we replace I 2 with its expectation value, ignoring the non-Gaussian 
piece, we find 

din p(J\f NL ) = ^ _ Jnl]j (27) 
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which is exactly the condition for an optimal estimator to exist. While our original conclusion, that T 3 — is only 
optimal for underlying Gaussian distributions, is still true; we can clearly see that within the weakly non-Gaussian 
limit our estimator is optimal. If the estimator was optimal its variance would exactly equal the bound derived from 
the Fisher Matrix. For the non-Gaussian underlying distribution (the inverse of) the Fisher Matrix, Eq. 11261) . and 
the estimator variance, Eq. <H3i, differ by terms proportional to f% L p?- This is precisely the type of term that 
can be ignored within the weakly non-Gaussian approximation. Therefore assuming this approximation is valid, the 
estimator d' 3 — 3/i 2| F is optimal even for underlying non-Gaussian distributions. 

This is not coincidence, but a direct consequence of the regularity condition, Eq. ®, and the weakly non-Gaussian 
approximation. Since the regularity condition must hold for all values of f nl,i we can require it to hold term-by-term 
in our expansion in Jnl- The first-order terms in the regularity condition are ( Ii)ng ~ Jnl{I 2 )g > where G and 
NG denote the Gaussian and non-Gaussian expectation values, respectively. Since we know the first-order terms 
must vanish, we can infer ( I\)ng = / nl{I 2 )g■ Thus if we can replace I 2 with its Gaussian expectation value then 
/jvl = 7i/(/ 2 )g will be an unbiased optimal estimator with variance 1/(/ 2 )g- 

Our argument depends on being able to replace I 2 with its expectation value, clearly when the sample size grows 
the error introduced by making this assumption vanishes. Now we will calculate conditions on the sample size for this 
property to hold. The expectation value will have a variance which decreases inversely with the sample size, N, 


Var{ 


6 ’ 


287 

9AT’ 


(28) 


thus we need N 287/9 for our estimator to be optimal. This number is simply due to combinatorial factors and 
therefore we do not expect it to drastically change when we consider the full problem with radiative transfer and a 
scale-invariant distribution. Moreover WMAP observes approximately 10 6 pixels, so we expect this approximation to 
be quite good. 

To check our conclusions, we should still calculate the Fisher Matrix elements and compare them with the variance 
of /jvl- After scaling our results, since we have N IID samples, we find to 0(f^ L p 3 ) 


p _ N f 6^ 2 + 368 f 2 NL \x 3 -8f NL p \ 

V 2 / m 2 + 2 °f 2 NL J ’ 


(29) 


where A = (/jvl,m)- The errors on the parameters are given by the inverse of the Fisher Matrix 

p-i = 1 f V6/^ 2 - 28/ 2 l /3 2f NL p/3 \ 

N \ 2f NL p/3 /r 2 /2 - 7/ 2 iM 4 /3 ) ’ 


(30) 


thus the Cramer-Rao bound on /jvl is smaller than the variance of /jvl except when our underlying map is Gaussian. 
The fractional discrepancy caused by underlying non-Gaussianity is 


Var(fNL) - (F ff ) 1 
Var(fwL) 


170/^jlM 2 < 2.3 x IQ” 3 


(31) 


assuming, /jvl/t A 3.5 x 10 , which is the best current constraint. Clearly we are justified in ignoring the 

corrections and in considering the three-point function to be optimal even when the underlying non-Gaussianity is 
non-zero. 

We can extend this discussion by removing the assumption that p is known a priori , then we must simultaneously 
estimate /jvl and /i. Dropping this assumption we will search for an estimator of /jvl that jointly estimates /r. 
Actually it is most natural to estimate A = /jvl^ 4 , using this variable we find that the coupled unbiased estimator is 


A= -- 


N 


6 N — 3 IV ^ 


(i 




(32) 


with variance 


Var( A) 


p 6 N 2 — 3N + 12 
6 N (N — 3) 2 


(33) 


At N = 100, the variance of A is increased by just 3% with respect to /jvl- In general the simultaneous estimation of 
parameters that are correlated will increase the estimator variance. However the variance will asymptotically approach 
the single parameter estimator variance as the size of the data sample grows. 






IV. SCALE-INVARIANT LOCAL MODEL NON-GAUSSIAN DISTRIBUTIONS 


The purpose of this section is to determine whether an optimal estimator for /jvl exists when we change the 
underlying distribution from Poisson to scale-invariant. Observations of the CMB and large scale structure imply 
that the primordial spectrum of curvature perturbations is scale-invariant; therefore, we must adapt the discussion of 
the previous section. The Poisson distribution covariance matrix is diagonal in all bases; however, the real-space basis 
is convenient because the “Local Model” definition of the non-Gaussian held does not mix different modes in this 
basis. Therefore in the real-space basis we were able to analyze a single pixel and appropriately scale the final results. 
For a scale-invariant distribution the covariance matrix is no longer diagonal in a real-space basis, but only in a 
Fourier basis. Moreover we observe the primordial curvature perturbations projected on the sky as CMB anisotropies 
after hydrodynamical and gravitational evolution. The covariance matrix of the CMB anisotropies is diagonal in a 
spherical harmonic basis. 

Taking these features into account we will determine if the bispectrum, the three-point correlation function in the 
spherical harmonic basis, contains all of the non-Gaussian information of a CMB map. This problem will be divided 
into two parts: (1) We will analyze a 2-D scale-invariant distribution without radiative transfer in a spherical harmonic 
basis; (2) We will include the effects of radiative transfer and show that the WMAP non-Gaussianity estimator is an 
optimal estimator for Jnl- 


A. Sachs-Wolfe Effect 

At first we will ignore the effects of radiative transfer and consider scale-invariant primordial curvature perturbations 
projected onto the sky. This essentially occurs on large angular scales where the Sachs-Wolfe effect directly maps the 
curvature perturbations onto temperature anisotropies. However, the main purpose of this subsection is simply to 
show the changes in the PDF as we switch from an underlying Poisson distribution to a scale-invariant one. 

The two-point correlation function of the projected Gaussian curvature field is 

C = [ k 2 dkP(k) J ' (k J D \ (34) 

TT J 9 

where td is the distance to the surface of last scattering. For a scale-invariant power spectrum, P{k) = A/k 3 , can be 
exactly evaluated as 


dm,m' Dl — $1,1' dm,r. 


A 1 

97 T 1(1 + 1) 


Now defining the Gaunt Integral as 


tmimams ~ J ^ (™)i(sm3 (^) 


(2li + 1 )(2Z2 + 1 )( 2?3 + 1 ) f li l 2 I3 \ / h I2 I3 

47t V 0 0 0 / l mi TO 2 m 3 J ’ 


and the average of the quadratic curvature fluctuation as 

<* 2 (*» = f 2 = 


(35) 


(36) 

(37) 


(38) 


we can follow the above steps to expand the non-Gaussian PDF in powers of JnlH- However, we must first calculate 
the Jacobian transformation of the delta-function 


Jlm,ab — 


dgir, 

d<l\, 


= Sl, a Sm,b-fNL(-l) m Y Q, 


d 


llll2 

mm 1 rri 2 qq 


"^17711^2 7712 J 


h,h, 7711, m 2 

= -5l,aS m , b -2f NL (-l) m Y, ^VAn.^Ot/V), 


Zl,771i 


(39) 

(40) 


where gi m is the argument of the delta-function, used to define the Local Model PDF in Eq. 0 , expressed in a 
spherical harmonic basis. 
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For the Local Model non-Gaussian PDF we find 


]ogP(*\f NL ,C) = ^ + fNL* T C-\& - m 2 ) - logdet + 0{f% Lf i 2 ), 


which expressed in a spherical harmonic basis is 
lQgP(WNL,C) = 


2 ^ L> ; 

lm 


d<S> 


f \ /7LLL ^l m l ,T( lT/ 

JNL / J ^171117121713 ti 2^2 *^ 3 m 3 

(Z,m) ^ 


- /jVL 


P^oo 

D n 


9 5 (0>) 


- logdet ^ ' + OUnl^ 2 )- 


(41) 


(42) 


The notation (/,m) is meant to imply that the sum is over all three U and nii. It is clear that rotational invariance, 
enforced through the Gaunt Integral selection rules, forces the terms linear in 4/ to only contain the monopole term, 
^oo- 

We can simplify matters by examining the derivative of the expression in Eq. ©J; using the standard matrix 
identity, Log Det A = Tr Log A, to rewrite the term that comes from the Jacobian transformation of the delta- 
function, we find 


<91ogP(T|./W L ,C) lTfT 


dJ 


dfNL 


= V' 1 C ^(T 2 - li 2 ) - Tr[J~ 1 ^-] + OifNLfj), 


dfNL 


which expressed in the spherical harmonic basis is 


dlogP(^\f NL ,C) y- V, . P^oo 0 , T , n(f A 

dfNL / j ^mim2m3 ^/2 m 2^^3 m 3 2^00 H” ^d\JNL^) • 

(Z,m) 1 


Using our intuition from the Poisson case we can identify 


f' nl (’£) = E 
(Z,ra) 


2h $i imi , Tt , T , ^00 (r,,,2 

m 2 m 3 jj 


^hmNhms - — (2/^ + Do), 
h Do 


(43) 


(44) 


(45) 


as the scale-invariant generalization of the estimator for f nl ■ There are notable differences between the scale-invariant 
case and the Poisson case. Fundamentally, these differences are a result of expressing the PDF in a spherical harmonic 
basis instead of a real-space basis. The assumption of scale-invariance only affects the form of Di. 

Rotational invariance forces the linear terms to be proportional to the monopole anisotropy modes. Recall that 
the CMB monopole anisotropy is unmeasurable in principle and the primordial dipole anisotropy is dominated by 
the Doppler effect due to the local kinematic motion of the galaxy, so these modes are typically removed from the 
data. Labelling the original data as T° and the new data with the monopole and dipole modes eliminated as T 1 , the 
projection operator will change the PDF of the original data as 

P(4- 1 ) = J (46) 

where M is the relevant projection matrix. For all non-monopole and dipole modes the effect of the functional 
integration will simply be to replace 4/° with T 1 . Since the projection matrix eliminates the monopole and dipole 
modes in the argument of the delta-function, the monopole and dipole terms are simply integrated out. This functional 
integration simply eliminates the term in P(T) which is linear in Too- 

Moreover this functional integration also removes all terms from the cubic piece that contain any Too or Ti m 
modes. It is clear that any term linear or cubic in a monopole or dipole mode will vanish. However it is possible that 
terms such as TooToo4/; m or TnTi_iT; m , which survive the functional integration, might introduce a new linear 
term containing higher modes. Fortunately the symmetry properties of the Gaunt Integral eliminate these terms since 
Gm-mo <Vo (see Appendix B of 01). Therefore the projection of the monopole and dipole eliminates all terms 
from the PDF which contain any monopole or dipole modes and importantly does not introduce a new linear term. 
After performing this projection the new estimator is 

W*) = E (47) 

(l,m) h 


where now the sum excludes all modes with h < 1. 
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We must decide whether this estimator contains all the information of the modified PDF. In analogy to the Poisson 
case, the f^ L term is an even function of the 4 >i m modes. Therefore its expectation value will have a purely Gaussian 
piece which is fixed by the regularity condition to be precisely the correct value for the scale-invariant PDF to satsify 
Eq. 11111 and admit an optimal estimator. Again we conclude that /jvi( 1 h) is an optimal estimator in the weakly 
non-Gaussian limit. 


B. Radiative Transfer 

Now we will include the effects of radiative transfer and we will find that the 2-D CMB version of our optimal 
estimator is the Weiner Filter estimator used by the WMAP science team [22. The process of radiative transfer alters 
the amplitudes of the 3-D curvature perturbations and projects them onto 2-D CMB temperature anisotropies. This 
process will relate the CMB PDF to the Local Model PDF as 

P(a\f NL ) = J d N ^8^ M \a lm - I r 2 dr ai (r)* lm (r))P(*\f NL ), (48) 

where M < N because of the projection. The additional spurious degrees of freedom, which do not affect the 
observable CMB anisotropies, can be integrated out. Here we define 

ui(r) = — J k 2 dkji(kr)Ai(k), (49) 

where A i(k) is the standard radiation transfer function |3 ll ). We can connect the present discussion with previous 
case of the Sachs-Wolfe effect by noting that the formulae of the previous subsection can be reproduced by choosing 

ai(r) = - [ k 2 dkji(kr)\ji{kT D ). (50) 

TT J 3 

By substituting this expression in the following equations the results in the previous subsection can be derived. This 
formula is valid on large angular scales when the wavelength of the relevant primordial curvature perturbation is much 
larger than the size of the sound horizon. 

To calculate this functional integral we will use the exponentiated form of the delta-function 

J r 2 drA z (r)T im (r)) = J (51) 

here Bi m is simply a “dummy variable.” Using this representation we can “complete the square” and perform the 
functional Gaussian integrations. 

First we must find the 3-D Local Model PDF, the covariance matrix between two primordial 3-D curvature pertur¬ 
bations can be calculated as 


C = ( $ i imi (n)$i 2 m 2 (f 2 )) = 8i lt i 2 8 mitm2 Di 1 (r 1 ,r 2 ) (52) 

= Si lt i 2 5 mi ,m 2 - J k 2 dkP(k)ji 1 (kr 1 )j h (kr 2 ). (53) 

Since we will need the inverse of C in order to express the PDF of we can symbolically define the inverse of Di(r i, 7 * 2 ) 
as 


[ r 2 drDi(n,r)D l 1 (r,r 2 ) = ^ ri 2 

J r i 


(54) 


Using these definitions the non-Gaussian PDF can be expanded as 
logP(^|/jvL,C) = log P g (Wnl,C) + log P ng (Wnl,C) 


= T. f rldr 1 rldr 2 {-l) rn ^i-m{r 1 )D l 1 (r 1 ,r 2 )^im{r 2 ) 

l,m 

+ fNL j r l dr ^ r ldr2Q l ^ am3 ^hmAr 1 )Di^{r 1 r 2 )^i 2m3 {r 2 )^i 3m A r 2) + 0{f% L ti 2 ), 

( l,m) 


(55) 
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where we have ignored the irrelevant constant piece and the linear terms since they will ultimately vanish when we 
project out the monopole mode. 

At this point it will be convenient to introduce the following function 

0i(r) = J k 2 dkP(k)ji(kr)Ai(k). (56) 

which is defined such that 

J r 2 drai(r)f3i(r) = Q. (57) 

Using the definition of D ; _1 (ri,r 2 ) we can show that 

ai(n) = J r2dr 2 D~ 1 (r 1 ,r2)/3i{r 2 ), (58) 

A(n) = J rjdr 2 Di(n,r 2 )ai(r 2 ) (59) 

these properties will be very useful in what follows. Note that the functions aAr) and 3i(r) are equivalent to b,(r) 
and bf(r) defined in 0. 

Now we are ready to perform the functional integration and find the non-Gaussian CMB PDF. We must “complete 
the square” of the following term 

-§ »-!)-[/ rldr 1 rldr 2 (-l) m 'l!i- m Df t (r 1 ,r 2 )'S’i m (r 2 ) - 2iBi_ m J r 2 drai{r)^i m {r)] (60) 

1,171 

This can be rewritten as 

-- / r idrirldr 2 (^i-m{ri) - (r 1 ,r 2 ){^im(r 2 ) - Cim(r 2 )) + Bi- m BimCi ], (61) 

Z l,m d 


where 


C im{r) = —i(3i(r)Bim. 

Once more we must “complete the square” for the following term 

— 2 yi(~l ) m [ClBl-mBlm + ZiBi-maim] 
l,m 


(62) 


(63) 


This can be rewritten as 


-1 £(-l nCK^l-m - m-m)(B lm - Vlm ) + ^ 


i&lr 


Ci 


(64) 


where 


mUr) = (65) 

Thus we are left with the correct Gaussian piece for the CMB PDF. Performing the two Gaussian functional integra¬ 
tions is equivalent to performing the substitution 

4 ’im{r) -» (66) 

into the non-Gaussian cubic portion of the PDF. 

This substitution is equivalent to the Weiner Filter solution proposed by 22] in order to estimate the underlying 
curvature perturbations from an observed CMB anisotropy map. The process of projection and radiative transfer is 
not invertible, but nevertheless we can regularize the inversion process by requiring that the reconstructed potential 
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minimizes the variance. This approach was adopted in |2^| . where it was shown that the optimal inversion procedure 
was to Weiner Filter the CMB map as 


,t, i \ $( r ) (an\ 

= —p — aim- (67) 

Here we have demonstrated that a straightforward functional integration of the Local Model PDF gives the same 
result. 

Performing the functional integration, or equivalently substituting the Weiner Filter solution for T into Eq. <t55l> . 
we find that the non-Gaussian cubic term in the PDF 

log P NG (a\f NL ,C) = Y Cm 2 m 3 f rldr 1 rldr 2 ^ hrni (r 1 )D- 1 {r 1 ,r 2 )^i 2m2 (r 2 )'i’i 3 rn 3 {r 2 ) (68) 

(Z,ra) 


E nhhh a hrn 1 ^l 2 rn 2 ^l 3 rn 3 / _2 

ym 1 m 2 m 3 

( l,m) 


E nhhl 

m\m 

( l,m)' 


3 

m 2 m 3 


C h c l2 c h 

^l\m\ Q , l 2 m 2 

c h c l2 c h 


r 1 dr 1 r 2 dr 2 f3 h ( n)D h (n, r 2 )Pi 2 (r 2 )f3 h ( r 2 ) 


K 


I 2 I 3 


where we define the reduced CMB Bispectrum as 


Khh = 2 / r 2 dr[a;i(r)/3; 2 (r)/3j 3 (r) + P h (r)ai 2 (r)/3i 3 (r) + Ph{r)Pi a (r)ai 3 {r)\, 


(69) 


and the prime indicates that we restrict the sum over li such that l\ < l 2 < I 3 . It is standard to separate the piece 
of the CMB three-point correlation function fixed by rotational invariance. We do this by introducing the Gaunt 
Integral, Eq. m, and the reduced CMB bispectrum, which are related to the CMB three-point correlation function 
as 


(ai \rn\&l 2 rri 2 Q j l 3 m 3 ) Q m\m 2 m 3 bhhh- ( 70 ) 

Combining these results we find that the scale-invariant non-Gaussian PDF for the CMB anisotropies, ignoring the 
constant piece, is 


logP(a\f NL ) = —zY 


a lm a lm 

Cl 


Inl Y 

( l,my 


nhhh 

y mi m 2 m 3 


K 


I 2 I 3 


C h c l2 c h 


O j l 2 m 2 a hm 3 + 0(f NLf i 2 


(71) 


Again this has the correct form to admit an optimal estimator, within the weakly non-Gaussian approximation, which 
is 


/jvl(o) = 


£ 

( l,m )' 


7 I 1 I 2 I 3 


^hhh 


C h c l2 c t 3 


Q'l 2 m 2 Q , l 3 m 3 5 


where the normalization constant is 


S n 


£ 


(2li + l)(2l 2 + 1 ) ( 2 Z 3 + 1 ) 

4-7T 


h h h \ 2 b hi 2 i 3 
000 ) C h C l2 C h 


(72) 


(73) 


A procedure based on this estimator has been implemented by the WMAP science team in their analysis of primor¬ 
dial non-Gaussianity. Their interpretation of the estimator is in terms of Weiner Filtered estimates of the primordial 
curvature. First the Weiner Filter /3i(r)/Ci is used to estimate the primordial curvature fluctuation H hm(r) from the 
CMB anisotropy a/ m , as in Eq. m- Then the two estimates of the primordial curvature perturbations are used to 
estimate the quadratic piece of the primordial curvature field according to the definition of the Local Model and cq (r) 
is used to calculate the CMB temperature anisotropy due to this quadratic piece. This quadratic CMB template is 
correlated with the observed CMB anisotropies to give an estimate, once properly normalized, of the non-Gaussianity 
of the observed CMB anisotropies. We have shown that this intuitive procedure results in an estimator that is optimal. 
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V. GENERAL NON-GAUSSIAN MODELS 

In the previous section we applied the basic ideas of estimation theory to show that the Local Model non-Gaussian 
CMB PDF admitted an optimal estimator in the weakly non-Gaussian limit. As mentioned in the Introduction the 
Local Model is a physically motivated idealization, however most primordial bispectra calculated within models of 
the early universe contain additional terms. Starting with the definition of the Local Model, Eq. 0 , we derived the 
resulting CMB PDF. Since it is not possible to find the general non-Gaussian PDF which corresponds to a general 
non-Gaussian bispectrum, we are unable to retrace the above steps in order to extend our analysis. However we will 
argue that the non-Gaussian CMB PDF in Eq. GD is simply the Edgeworth expansion, which holds for arbitrary 
non-Gaussian CMB PDF, and therefore our conclusion that the bispectrum estimator is optimal holds for arbitrary 
model of the primordial non-Gaussianity. Due to the arguments in the Introduction we assume that a model for 
primordial non-Gaussianity is best characterized by its bispectrum and that we should try to constrain the amplitude 
of the bispectrum. We choose to define the amplitude of a general bispectrum as the coefficient of the bispectrum 
evaluated in the equilateral configuration (see [j| for a full discussion of this point). 

The Edgeworth expansion is a way to express the non-Gaussianity of a PDF in the form of a series expansion 
(see @J and references there within). This allows one to explicitly write down a non-Gaussian PDF if its lowest 
order moments or correlation functions are known. The Edgeworth exp ansion of a 1-D PDF is a simple expansion in 
Hermite polynomials; a multivariate generalization has been found j3j,|3J]. Adopting the notation relevant for CMB 
anisotropies the Edgeworth expansion is 


-P(a) [1 2 ' (o J i- irn , 3 cii 2rr i 2 (ii 3rn3 ') 


d 


d 


d 


( l,m)' 


ddlimi $0*1 2 m 2 dai 3 

r 


m 

lm 


2 C t 




(74) 


where (ai l7ni ai 2rn2 ai 3m3 ) can still be decomposed into the Gaunt Integral and the reduced bispectrum. Here the 
general reduced bispectrum can be calculated as 

b hhi 3 = (^) 3 J kldkikldk 2 kldk 3 J x 2 dxj h {k 1 x)j l2 (k 2 x)ji 3 (k 3 x)B(k 1 , k 2 , k 3 )A h (k 1 )Ai 2 (k 2 )Ai 3 (k 3 ), (75) 

where B(k\, k 2 , k 3 ) is the general primordial bispectrum Q|. When the primordial bispectrum is calculated according 
to the Local Model the CMB reduced bispectrum is given by Eq. 

Performing the functional differentiation in the Edgeworth expansion we find 


P(a) = n 


a lm a lr 

e 2C i 

, V 2 nCi 

lm v 1 
O/i 


h 1 h rhhh r a lim 1 0,i 2 m 2 ai 3m3 

■[I + 2^ W1M3P 1 - 


( l,m)' 


- (-1 r 3 (^-s W3 6 m2 ,. m3 + 


• 2‘3 7712777,3 

a l 2 rri 2 


Cn.Cn 


c h c l 2 c h 

X X I tt| 3™3 

vh,l 3 <Jm lt -m 3 T „ „ 


dll . / 2 hr. 


(76) 


Again the properties of the Gaunt Integral force the linear terms to be proportional to the monopole anisotropy 
mode, aoo- After projecting out the monopole anisotropy the form of the non-Gaussian cubic term in the Edgeworth 
expansion is identitical to the form of non-Gaussian cubic term in the Local Model CMB PDF, Eq. CD- 

In the previous section we argued that estimators based on the bispectrum are optimal if the PDF can be expressed 
as in Eq. CD- This form is not unique to the Local Model, but is simply part of the Edgeworth expansion which is 
relevant regardless the form of primordial bispectrum. The only necessary condition is that the level of non-Gaussianity 
is small. Thus we conclude that estimators based on the bispectrum are optimal for any form of primordial non- 
Gaussianity characterized by its bispectrum provided the amplitude of the non-Gaussianity is sufficiently small. 


VI. DISCUSSION 

We analyzed the standard model for primordial non-Gaussianity with the tools of estimation theory and found that 
the estimator constructed out of the CMB three-point correlation function is an optimal estimator for / nl , the ampli¬ 
tude of the primordial non-Gaussianity. Our conclusion is only true within the weakly non-Gaussian approximation, 
which implies that we ignore non-Gaussian contributions to four-point correlation function compared to the much 
larger Gaussian contributions. In our calculations this is equivalent to ignoring terms proportional to f^ L {^(x) 2 ), 
which is already constrained by the WMAP data to be extremely small. Therefore we can consider the standard 
WMAP estimator, which is based on the CMB three-point correlation function, to be optimal in practice. This 
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property only depends on the weak level of non-Gaussianity, not on any assumed form for the primordial bispectrum. 
Therefore we argued that the amplitude of a general primordial bispectrum can be optimally estimated. 

Our calculations demonstrated that the WMAP estimator was optimal to constrain /jvl and therefore contained 
all the information in the observed data on this parameter. Future work can now focus on practical implementations 
of this optimal estimator. Some of the practical concerns that affect the implementation of an estimator for f^L 
include biasing due to non-Gaussianity from secondary anisotropies, non-uniform instrument noise and the need to 
jointly estimate /nl and the basic cosmological parameters from the same data which will degrade the performance 
of the estimator. While we demonstrated that the WMAP estimator is optimal for the estimation of /nl, there still 
is a need to find a quick method to optimally estimate individual three-point correlation function modes from CMB 
maps with Galactic foreground contamination and inhomogeneous instrument noise [24L l28l . We implicitly assumed 
that the three-point correlation function modes could simply be calculated from observed CMB maps. For maps with 
inhomogeneous noise, this procedure results in sub-optimal error bars on the individual bispectra modes. An optimal 
procedure, which is quite slow for large data sets, has been developed [2j3j; a quicker procedure must be found if it 
can be realistically applied to the WMAP and Planck data sets. 

Also, the non-Gaussianity of secondary anisotropies is not well known on the areminute scales relevant for current 
and future CMB experiments. Fortunately on intermediate scales there is little contamination of the primordial non- 
Gaussianity estimator by secondary anisotropies; however we simply do not know if this is true on the extremely small 
scales relevant for upcoming CMB experiments. Also we do not know how errors in the basic cosmological parameter, 
from which we calculate the radiation transfer functions and the Weiner Filters, will affect this analysis. This topic 
should also be investigated. 

While much progress has been made in both the theoretical understanding and practical analysis of the signatures 
of primordial non-Gaussianity in CMB maps, there still is much more work needed to be done before the field will 
become fully developed. However the tremendous potential for insight into the production mechanism of the primordial 
curvature perturbations makes this work worthwhile. 
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APPENDIX A: SIGNAL-TO-NOISE RATIO OF HIGHER-ORDER ESTIMATORS 

In this Appendix we demonstrate the features of the S/N of an estimator based on an n th -order correlation function 
needed to argue that the estimator based on the three-point correlation function would have the largest S/N. We will 
do our calculation within a toy model that should retain all of the important features of the problem. In what follows 
we ignore radiative transfer and the curvature of the sky, so we will simply observe the underlying modes within the 
flat-sky approximation [36j. 

Extending the basic idea of the “Local Model” we will assume that the observed non-Gaussian field T can be 
expanded in the Gaussian field <I> as 

T = $ + / 2 [$ 2 - ($ 2 )] + / 3 [$ 3 - ($ 3 )] + • • • + /„[$" - ($ n )] + • • • . (Al) 

The various n-point correlations function of 4/ can be calculated by substituting our definition of 4/, Eq. EH, and 
using Wick’s Theorem. It is important to note that we are interested in connected correlation functions, which contain 
a single delta-function. In what follows we ignore all 0(1) factors. 

Working within our model, the power spectrum is evaluated as (36] 

C(l) ~ (A2) 


and the connected ?z t?l -order correlation function as 

T(h,l 2 , •••,«„)- (fn—l +/L2 + --- + fs~ S + ■■■ / 2 "~ V 


i Z 2 + ^ H-+ Z 2 


(A3) 


where A is the amplitude of the primordial curvature power spectrum, P(k) = A/k 3 . Note that there is no contribution 
to 7i - order correlation function from /„. 
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We also define the total ( S/N) 2 of the estimator as 

( *, J2, J2, + h + ■ ■ ■ + ln)T(h,h, - ■ ■ ,l„)] 2 

( w ) ~J dhdl2-dl n -_______- (A4) 

Substituting the results for C; and T(l\, l 2 , • • • , l n ) into Eq. (IA4I) we find 

(■^■) 2 ~ fskyftot^ 1 “ J d 2 lid 2 l 2 ■ ■ ■ d 2 l n 6^ 2 \li + l 2 H-+ Z„) —— jfj- 2 - p —— ■ (A5) 

where f s k y is the fraction of the observed sky and f to t = fn-i + fn -2 + ■ ■ • + /™ _s + • • • f 2 ~ 2 is the total coupling 
cofficient that results from considering all contribution to a given correlation function. 

The upper bound of integration for all li is / mol so we can rescale the integral to be “dimensionless,” then we find 

(|) 2 ~ fskyfl t A n - 2 l 2 max = f 2 ot A n ~ 2 N pix , (A6) 


where we defined N p i x to be the total number of observed pixels. In addition to numerical factors, the integral may 
also contribute factors of the “Coulomb Logarithm,” i.e. h(! moj /l m ;„), as found in the bispectrum calculation |.‘>fil| . 
These additional correction are small, so we can argue that to 0(1), the S/N is simply the product of the characteristic 
amplitude and the square-root of the number of observed pixels. We have established the necessary results used in the 
Introduction where we argued that the S/N of the estimator constructed out of the three-point correlation functions 
will be dominant. 
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